Mining Network Logs: Information Quality Challenges
Abstract
Network logs are key to many critical functions such as network security, network monitoring and network management. They play an important role in intrusion detection and early warning for potential worm and virus attacks. However, network log data are under-utilized: they are largely ignored until an event occurs that requires backtracking for diagnostic purposes. There are two main reasons why network logs are not subject to more rigorous analysis: their sheer volume and their inherent information quality challenges. In this paper, we use the context of classical data quality principles to outline some of the issues that we encountered, and the solutions that we devised, while working on a real network management application involving large amounts of network log data. While our discussion is centered on our case study, the problems we encountered and the solutions we devised are general and apply to a wide array of network log data and applications.
Similar Resources
Problems and Challenges When Implementing a Best Practice Approach for Process Mining in a Tourist Information System
The application of process mining techniques for analyzing customer journeys seems promising for different stakeholders in the tourism domain, i.e., tourism providers are enabled to, e.g., find nice offers or partner services, and guests can improve their holiday experience. One precondition for mining processes is (high-quality) logs. This paper reports on experiences in implementing a dat...
Graph or Relational Databases: A Speed Comparison for Process Mining Algorithm
Process-Aware Information Systems (PAIS) are IT systems that manage and support business processes and generate large event logs from the execution of those processes. An event log is represented as a tuple of the form CaseID, TimeStamp, Activity and Actor. Process Mining is an emerging area of research that deals with the study and analysis of business processes based on event logs. Process Minin...
Discovering Emerging Patterns for Anomaly Detection in Network Connection Data
Most intrusion detection approaches rely on the analysis of the packet logs recording each noticeable event happening in the network system. Network connections are then constructed on the basis of these packet logs. Searching for abnormal connections is where the application of data mining techniques for anomaly detection promises great potential benefits. However, mining packet logs poses addit...
Wanna Improve Process Mining Results? It’s High Time We Consider Data Quality Issues Seriously
The growing interest in process mining is fueled by the increasing availability of event data. Process mining techniques use event logs to automatically discover process models, check conformance, identify bottlenecks and deviations, suggest improvements, and predict processing times. The lion’s share of process mining research has been devoted to analysis techniques. However, the proper handling o...
Safelog: Supporting Web Search and Mining by Differentially-Private Query Logs
Query logs can be very useful for advancing web search and web mining research. Since these web query logs contain private, possibly sensitive data, they need to be effectively anonymized before they can be released for research use. Anonymization of query logs differs from that of structured data, since query logs are generated from natural language and the vocabulary (domain) is infinite. This u...